The command also supports regular expressions, anchoring, grouping, and
much more. Use the man grep command to read more about its capabilities.
Filtering with awk
The awk command is a data processing and extraction Swiss-army knife. You
can use it to identify and return specific fields from a file. To see how it works,
take another close look at our log file. What if we needed to print just the IP
addresses from this file? This is easy to do with awk (Listing 2-25).
$ awk '{print $1}' log.txt
Listing 2-25
Printing the first field
The $1 represents the first field of every line in the file, where the IP
addresses are. By default, awk treats spaces or tabs as separators or delimiters.
Using the same syntax, we can print additional fields, such as the timestamps.
Listing 2-26 prints the first three fields of every line in the file.
$ awk '{print $1,$2,$3}' log.txt
Listing 2-26
Printing the first three fields
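To see what the commas in the print statement do, here is a minimal sketch that runs awk on a single made-up log line. The sample line and the variable names are assumptions for illustration and are not taken from log.txt. With commas, awk joins the fields with its output field separator (a space by default); without them, the fields are concatenated directly.

```shell
# Hypothetical sample line in a typical web-server access log layout
line='203.0.113.7 - - [01/Jan/2023:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512'

# Commas between fields: awk joins them with a single space
with_commas=$(printf '%s\n' "$line" | awk '{print $1,$2,$3}')

# No commas: the fields are glued together with nothing in between
without_commas=$(printf '%s\n' "$line" | awk '{print $1 $2 $3}')

echo "$with_commas"      # 203.0.113.7 - -
echo "$without_commas"   # 203.0.113.7--
```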
Using similar syntax, we can print the first and last fields simultaneously. In
this case, NF holds the number of fields in the current line, so $NF refers to
the last field (Listing 2-27).
$ awk '{print $1,$NF}' log.txt
Listing 2-27
Printing the first and last field
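The difference between NF and $NF is easy to verify on a single line. The following sketch uses a made-up log line (an assumption, not the contents of log.txt) and hypothetical variable names to print the field count, the last field, and the second-to-last field.

```shell
# Hypothetical sample line; it contains 10 whitespace-separated fields
line='203.0.113.7 - - [01/Jan/2023:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512'

# NF on its own is the number of fields in the current line
field_count=$(printf '%s\n' "$line" | awk '{print NF}')

# $NF dereferences that number, giving the content of the last field
last_field=$(printf '%s\n' "$line" | awk '{print $NF}')

# Arithmetic inside $(...) reaches fields relative to the end of the line
second_to_last=$(printf '%s\n' "$line" | awk '{print $(NF-1)}')

echo "$field_count"      # 10
echo "$last_field"       # 512
echo "$second_to_last"   # 200
```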
We can also change the default delimiter. For example, if we had a CSV file
separated by commas, rather than spaces or tabs, we could pass awk the -F flag to
specify the type of delimiter, as in Listing 2-28.
$ awk -F',' '{print $1}' example_csv.txt
Listing 2-28
Printing the first field using a comma delimiter
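As a small illustration of the -F flag, the sketch below feeds two made-up CSV rows to awk. The data and variable names are assumptions for demonstration only. It also sets OFS, awk's output field separator, so that the printed fields are joined with commas again. Note that this simple approach splits on every comma, so it is not suitable for CSV files whose values contain quoted commas.

```shell
# Hypothetical CSV data: username, role, IP address
csv='alice,admin,10.0.0.5
bob,user,10.0.0.9'

# Split on commas and print the first column
names=$(printf '%s\n' "$csv" | awk -F',' '{print $1}')

# Print the third and first columns, joined by a comma via OFS
reordered=$(printf '%s\n' "$csv" | awk -F',' -v OFS=',' '{print $3,$1}')

echo "$names"
echo "$reordered"
```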
We can even use awk to print the first 10 lines of a file. This emulates the
behavior of the head Linux command. NR is a built-in awk variable that holds the
number of the record (by default, the line) currently being processed (Listing 2-29).
$ awk 'NR <= 10' log.txt
Listing 2-29
Printing the first 10 lines of a file
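As a quick check of how NR comparisons behave, the sketch below generates 20 numbered lines with seq and counts how many lines each condition lets through. The variable names are hypothetical, and seq is assumed to be available, as it is on most Linux systems. Because NR starts at 1, the choice between <= and < changes the result by one line.

```shell
# NR <= 10 matches records 1 through 10
count_le=$(seq 1 20 | awk 'NR <= 10' | awk 'END {print NR}')

# NR < 10 stops one line earlier, at record 9
count_lt=$(seq 1 20 | awk 'NR < 10' | awk 'END {print NR}')

# NR can also be printed directly, for example to number lines
numbered=$(printf 'a\nb\n' | awk '{print NR": "$0}')

echo "$count_le"   # 10
echo "$count_lt"   # 9
echo "$numbered"
```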
You’ll often find it useful to combine grep and awk. For example, you
might want to first find the lines in a file containing the IP address 42.236.10.117
and then print the HTTP paths that this IP address requested (Listing 2-30).
$ grep "42.236.10.117" log.txt | awk '{print $7}'
Listing 2-30
Filtering an IP address and printing the seventh field, representing HTTP paths
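One thing to keep in mind with this pipeline is that grep matches the pattern anywhere in the line, and an unescaped dot matches any character. The sketch below, which uses made-up log lines and hypothetical variable names, compares the pipeline with an awk-only alternative that requires the first field to equal the IP address exactly. It is one possible variation, not a replacement for the approach shown in the listing.

```shell
# Hypothetical log lines; the second IP merely contains the target as a substring
log='42.236.10.117 - - [01/Jan/2023:10:00:00 +0000] "GET /login HTTP/1.1" 200 512
142.236.10.117 - - [01/Jan/2023:10:00:01 +0000] "GET /admin HTTP/1.1" 403 128
42.236.10.117 - - [01/Jan/2023:10:00:02 +0000] "GET /about HTTP/1.1" 200 256'

# grep matches the pattern anywhere in the line, so 142.236.10.117 is included
grep_paths=$(printf '%s\n' "$log" | grep "42.236.10.117" | awk '{print $7}')

# awk compares the whole first field, so only exact matches are printed
awk_paths=$(printf '%s\n' "$log" | awk '$1 == "42.236.10.117" {print $7}')

echo "$grep_paths"
echo "$awk_paths"
```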
Black Hat Bash (Early Access) © 2023 by Dolev Farhi and Nick Aleks